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ABSTRACT 

We present a new method to estimate three-point correlations in Cosmic Mi- 
crowave Background maps. Our Fast Fourier Transform based implementation 
estimates three-point functions using all possible configurations (triangles) at a 
controlled resolution. The speed of the technique depends both on the resolution 
and the total number of pixels N. The resulting N log N scaling is substantially 
faster than naive methods with prohibitive N 3 scaling. As an initial applica- 
tion, we measure three-point correlation functions in the First Year Data Release 
of the Wilkinson Anisotropy Probe. We estimate 336 cross-correlations of any 
triplet of maps from the 8 differential assemblies, scanning altogether 2.6 mil- 
lion triangular configurations. Estimating covariances from Gaussian signal plus 
realistic noise simulations, we perform a null-hypothesis testing with regards to 
the Gaussianity of the Cosmic Microwave Background. Our main result is that 
at the three-point level WMAP is fully consistent with Gaussianity. To quantify 
the level of possible deviations, we introduce false discovery rate analysis, a novel 
statistical technique to analyze for three-point measurements. This confirms that 
the data are consistent with Gaussianity at better than l-er level when jointly 
considering all configurations. We constrain a specific non-Gaussian model using 
the quadratic approximation of weak non-Gaussianities in terms of the /nlt pa- 
rameter, for which we construct an estimator from the the three-point function. 
We find that using the skewness alone is more constraining than a heuristic sub- 
optimal combination of all our results; our best estimate is /nlt = —110 ± 150 
assuming a ACDM concordance model. 

Subject headings: cosmic microwave background — cosmology: theory — meth- 
ods: statistical 
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1. Introduction 

The temperature fluctuations in the Cosmic Microwave Background (CMB) arc Gaus- 
sian to a high degree of accuracy (Komatsu et al. 2003a). Non-Gaussianity, if any, enters at 
a highly subdominant level. It could be either primordially generated along with Gaussian 
fluctuations by exotic inflationary models, and/or it could arise from secondary anisotropics, 
such as gravitational lensing, Sunyaev-Zel'dovich (SZ), or Sachs- Wolfe (SW) effects. Quan- 
tifying the degree and nature of non-Gaussianity in the CMB constrains specific inflationary 
models, as well as enhances our understanding of the secondary processes the CMB un- 
derwent beyond the surface of last scattering. Interpretation of any such measurement is 
complicated by the fact that systematics and foreground contaminations might also produce 
non-Gaussian signatures. 

Given the nearly Gaussian nature of the CMB, Appoint correlation functions, and their 
harmonic counterparts, polyspectra, are the most natural tools for the perturbative un- 
derstanding of non-Gaussianity. If it were generated by inflationary models admitting a 
$ 2 term, the leading order effect would be a 3-point function. On the other hand some 
secondary anisotropies, such as lensing, are known to produce 4-point non-Gaussianity at 
leading order (Bernardeau 1997). The skewness (or integrated bispectrum) was measured 
by Komatsu et al. (2003a) and 3-point correlation function by Gaztanaga & Wagg (2003); 
Eriksen et al. (2005). 

Many alternative statistics have been used to investigate non-Gaussianity in CMB. A 
partial list includes wavelet coefficients (Vielva et al. 2004; Mukherjee & Wang 2004; Liu 
& Zhang 2005; McEwen et al. 2005), Minkowski functionals (Komatsu et al. 2003a; Park 
2004), phase correlations between spherical harmonic coefficients (Naselsky et al. 2005), 
multipole alignment statistics (de Oliveira-Costa et al. 2004; Copi et al. 2004; Slosar & 
Seljak 2004), statistics of hot and cold spots (Larson & Wandelt 2005; Tojeiro et al. 2005; 
Cruz et al. 2005), higher criticism statistic of pixel values directly (Cayon et al. 2005). Most 
of these measurements are consistent with Gaussianity, although some claim detections of 
non-Gaussianity up to 3-cr level. These alternative statistics, albeit often easier to measure, 
typically depend on Appoint functions in a complex way, thus they cannot pin-point as 
precisely the source of non-Gaussianity. 

Among the three-point statistics, there is a perceived complementarity between har- 
monic and real space methods. The bispectrum can be relatively easily calculated for a full 
sky map (Komatsu et al. 2002), although the present methods have a somewhat slow N 5 ^ 2 
scaling (Komatsu et al. 2003b). Methods put forward so far use the "pseudo-bispectrum" , 
ignoring the convolution with the complicated geometry induced by galactic cut and cut-out 
holes. In contrast with harmonic space, the corresponding pixel space edge effect correc- 
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tions are trivial (Szapudi et al. 2001), since the window function is diagonal. Unfortunately, 
simple methods to measure three-point clustering exhibit a prohibitive iV 3 scaling if the full 
configuration space is scanned. To remedy the situation, most previous measurements of the 
3-point function only deal with an ad-hoc sub-set of triangular configurations (Gaztahaga & 
Wagg 2003; Eriksen et al. 2005). Both of these papers covered the full configuration space on 
small scales; the former paper also appears to have estimated most configurations on large 
scales, missing intermediate configurations with mixes scales. 

This work presents a novel method, which, at a given resolution, scans the full avail- 
able configuration space for 3-point level statistics using realistic computational resources. 
We find that the resulting configuration space itself is overwhelming to such a degree that 
interpretation of the results also requires novel methods. We introduce false discovery rate 
(FDR) technique as a tool to interpret three-point correlation function measurements. 

The next section introduces our algorithm to measure the 3-point correlation function, 
§3 illustrates it with an application to the WMAP first year data release, and §4 introduces 
the FDR method and applies it to our results. We summarize and discuss our results in §5. 



2. Measuring the three point correlation function 

The three point correlation function (e.g., Peebles 1980) is defined as a joint moment 
of three density fields ( = (<5 <5i5 2 ) at three spatial positions. For CMB studies Si denotes 
temperature fluctuations at position i on the sky, and () stands for ensemble average. If the 
underlying distribution is spatially isotropic, ( will only depend on the shape and size of a 
(spherical) triangle arising from the three positions. A number of characterizations of this 
triangle are possible and convenient. The most widely used are the sizes of its sides (measured 
in radians), or two sizes and the angle between them. This latter angle is measured on the 
spherical surface of the sky. 

One can use the ergodic principle of replacing ensemble averages with spatial averages 
to construct a nearly optimal, edge corrected estimators with heuristic weights (Szapudi & 
Szalay 1998; Szapudi et al. 2001, 2005) 

C(A) = ^ U (!) 

where we symbolically denoted a particular triangular configuration with A (any parametriza- 
tion would suffice), and /A fc oc 1 if pixels (i,j,k) G A, and otherwise. We also defined a 
multiplicative weight Wi for each pixel: this is if a pixel is masked out, and it could take 
various convenient values depending on our noise weighting scheme if the pixel is inside the 
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survey; e.g., in the case of flat weights it is simply 1. This simple estimator has been widely 
used in large scale structure, and it is nearly optimal with appropriate weights, (e.g., Szapudi 
& Szalay 1998; Kayo et al. 2004). It is entirely analogous to the successful estimators used 
for the measurements of the Q's for the CMB (up to harmonic transform, Szapudi et al. 
2001; Hivon et al. 2002). 

The naive realization of Equation 1 has a prohibitive iV 3 scaling if one needs to scan 
through triplets of pixels and assign them to a particular bin. The summation can be re- 
stricted and thus made faster if one restricts the number of configurations and the resolution 
(e.g., Szapudi et al. 1999b; Barriga & Gaztanaga 2002; Gaztanaga & Wagg 2003), or it can 
be sped up by using tree-data structures (Moore et al. 2001). Neither of these methods is 
able to scan through all possible configurations in megapixel maps with reasonable amount 
of computing resources. Here we propose a new method which uses both hierarchical pix- 
elization and Fourier methods motivated by Szapudi (2004); Szapudi et al. (2005) to scan 
through all the triangles simultaneously. Note that Gaztanaga & Wagg (2003) comes closest 
to our aims, but their simple two-step approach is not systematic enough to cover all possible 
triangles at a given resolution, and it is not fast enough for massive Monte Carlo simulations. 

In the following we will choose a parametrization of the triangle A using two of its 
sides 61,62, and the angle a between them. We define the configuration space as a set of 
(logarithmic) bins for the sides, and linear bins for the angle in their full possible range, 
i.e., 0,7r (remember that the sides of the triangle on the sky are also measured in radians). 
The given resolution is determined by the number of bins for it and the number of bins 
for a. Note that a particular triangle might appear more than once in this scheme, albeit 
with different resolutions. Different triangular bins of the three-point function are strongly 
correlated anyway, and the correlation from duplicating triangles can be taken into account 
in the general statistical framework over correlated bins. 

Given a triangular configuration, and a pixel i, all other pixels which enter the summa- 
tion in Equation 1 are located on two concentric rings of size 0\ and 2 . As a consequence, the 
summation over fixed a can be thought of as an unnormalized (raw) two-point correlation 
function between two rings. To obtain three-point correlation function, one has to multiply 
this two-point correlation function with the value of the center pixel % and finally sum over 
i. 

Calculating the two-point correlation function of rings can be fast if one repixellizes 
the map (c.f., Fig. 1) into rings with sizes matching the binning scheme for 0, and uniform 
division in a. Such a repixellization, resulting in ring-pixels as shown in Fig. 1, would take 
only iV steps even in a naive way; the HEALPix hierarchical scheme allows it to be done in 
log N time. 



- 5 - 



We use the following algorithm: let us start a recursive tree walk at the coarsest map, 
N S ide = 1 in the HEALPix scheme. For each pixel in this map, we determine, using its 
center, which ring-pixel it would belong to. If the size of the pixel is much smaller than 
this ring-pixel (how much smaller is a parameter of our algorithm: in this paper we used 
the condition that the pixel has to be smaller then 0.2 x the bin width which is also the 
approximate size of the ring-pixels), we record it. If not, algorithm splits the quad-tree, and 
calls itself recursively for each four sub-pixels. This procedure ends at the latest when the 
highest resolution (i.e. the one of the underlying map) is reached. If the bins are chosen 
appropriately such that large ring-pixels are set up for large triangles, for many pixels it will 
finish earlier. As noted above, the map has to be regridded around each pixel into rings of 
ring-pixels. In total, this takes 0(N log N) time. 

Calculating the two-point correlation function between rings speeds up using Fast Fourier 
Transform (FFT) methods, such as those put forward in Szapudi et al. (2001, 2005); Sza- 
pudi (2005). The recipe is the following. First, FFT every ring i to obtain complex co- 
efficients dk(0i); then calculate for every pair of rings the "pseudo power spectrum" 
a k(0i)ai(^j) + a k(@i) a k(9j), where * means complex conjugate. Due to the U(l) symmetry of 
the ring an inverse cos transform will give the (raw) two-point correlation function between 
the two rings (c.f. Szapudi et al. 2005; Szapudi 2005). If we have Ng rings, each of them 
N a ring-pixels, each FFT can be done in N a \ogN a time, and there is N e (N e + l)/2 cross 
correlations to be calculated for a full scan of configurations. All the above needs to be per- 
formed for each pixel as a center point. The total scaling (including the initial regridding) 
takes N(\ogN + N 6 N a logN a + N a N e (N e + l)/4)/2, where we took into account that the 
two opposite pixels can be handled in one go if a symmetric set of bins around 7r/2 are used 
for 9. 

While the above procedure to calculate raw (unnormalized) correlation functions ap- 
pears somewhat complex, we have checked with direct calculation that it gives numerically 
the same result as calculating correlations on the rings in a naive way. In order to obtain 
normalized correlation functions, the same procedure has to be followed for the rings associ- 
ated with weights/masks. Each configuration of the raw three-point function is divided with 
the mask/weight three-point function for the final result. For many realizations with the 
same mask, such as in the case of massive Monte Carlo simulations, the mask correlations 
need to be estimated only once, representing negligible cost. 

The above abstract scheme and calculation will be illustrated and further clarified with 
a practical application to WMAP next. 



Fig. 1. — The repixellization geometry viewed from position above the center and from the 
side of the center. #jS on the side view define circles on the top view that separate the rings. 
The radius on the top view corresponding to big circles on the side view that cut rings to 
the new ring-pixels. 

3. Application to WMAP: raw results 

3.1. Data and Simulations 

We demonstrate our method to calculate the three-point correlation function with an 
application to WMAP. We downloaded first year foreground cleaned maps from LAMBDA 
website 1 (Bennett et al. 2003). There are total 8 maps for 8 Differencing Assemblies (DA) in 
Q, V and W bands: Ql, Q2, VI, V2, Wl, W2, W3, and W4, already in HEALPix format. 
Following the two-point analysis of WMAP (Spergel et al. 2003; Fosalba & Szapudi 2004), 
we only used cross correlations, i.e. three-point correlation functions calculated from three 
different DAs. 

We produced 100 (Gaussian) simulations with SYNFAST in HEALPix package 2 . The 



1 http:/ /lambda. gsfc.nasa.gov/ 

2 http : / / www . eso . org/science / healpix / 
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input power spectrum, also from the LAMBDA website, was taken from ACDM model using 
a scale-dependent (running) primordial spectral index which best fits the WMAP, CBI and 
ACBAR CMB data, plus the 2dF and Lyman-alpha data. Every simulation consists of 8 
assembly maps as the data (Spergel et al. 2003). These 8 maps were generated with a 
same random seed, representing the same primordial CMB, but 8 different beam transfer 
functions. Then different simulated noise maps from LAMBDA web (Hinshaw et al. 2003) 
were added to the SYNFAST output maps. 

Since the non-Gaussian signal is exceedingly small, and on the smallest scales the data 
are noise dominated, we degraded all maps (simulations and data) to N S id e = 256 after 
applying the kp2 mask. More precisely, we added up pixel and weight values for each map; 
our two weighting schemes are presented in the next subsection. 

3.2. Binning and Weighting 

At the heart of our algorithm is the regridding of Figure 1 which matches our binning 
of the triangles. We chose 19 rings for half of the sphere surface, and the same bins are 
repeated on the other half symmetrically around 7r/2. The 19 bins are chosen to be uniformly 
distributed in logarithm between ^/n si d e and |. The number 19 was chosen such that 
0f +1 ~ 29f, which gives a logarithmic resolution of sfl. Every ring was divided to 20 ring- 
pixels in a. This number renders the resulting ring-pixels fairly compact, and it is also 
convenient for our chosen implementation of FFT (FFTW, Frigo & Johnson 2005). 

Weight maps were constructed using the kp2 mask and the noise profile of the maps. We 
used two weighting schemes: flat weighting where Wi is 1 or depending on the mask, and 
(inverse) noise weighting; for the latter (Bennett et al. 2003) we used the effective number 
of observations of the pixel. The weights need to be determined only up to multiplicative 
factor, as their overall normalization cancels from the algorithm. The average noise level <7o 
for each DA is used when combining ( over different cross correlations. 

The total number of triangular configurations in 38 rings with 11 possible values of a 
(angles large than it count to 2n — a) is 38 x 39/2 x 11 = 8151 for autocorrelations. The 
same number is valid for a cross correlations (which we will be exclusively doing) of 3 DAs. 
We introduce the notation (DAI, DA2, DA3), for the central pixel, the first ring and the 
second ring sampled from the three DA in this order. In addition we restrict the "first ring" 
has 9i no larger than that of the "second ring" . Then the total number of cross correlations 
between the 8 DA's is 8 x 7 x 6 = 336. Effectively, each triangle is calculated six times 
for a given triplet of DA's due to the possible 6 permutations. However, each sample has 
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a different resolution, therefore we opt to keep all possibilities. The resulting correlations 
are taken into account when dealing with correlated bins in general. In total, there are 
about 2.6 x 10 6 triangular configurations for each data or simulation set. Note that the total 
number of triplets of ring-pixels examined is 20 x N pix more, or 4.4 x 10 13 . This is still a lot 
smaller than checking 10 18 triplets of pixels naively in an N S i ze = 256 map. These numbers 
suggest that our algorithm even without FFT should take order of days, while the naive 
algorithm would need over 200 years of CPU. 

For a batch of 10 simulations (comprising of 10 x 8 DA maps), the calculation of the full 
three-point function in all the configuration takes about 90 hours on an Intel Xeon 2.4GHz 
CPU. This means each cross-correlation takes only about 2 minutes on average! About 10 
hours are saved by batch processing 10 sets of simulations: to estimate ( in the data alone 
(one set of 8 DA files) took 10 hours. 

Clearly, the given resolution does not extract all the information from the data, as there 
are approximately 0(N 3 ^ 2 ) distinct configurations of the bispectrum or three-point function. 
However, it surely must be redundant to extract more configurations than the amount of 
data. The ratio of data points vs. configurations is about 50 for our chosen bins. It is 
unlikely that it were fruitful to push this number towards much smaller values, although the 
speed of our algorithm would allow higher resolution. 

3.3. The Three-Point Function of WMAP 

Figure 2 shows a typical set of ( measurements for the (W2,W3,W4) DA cross-correlation. 
The results from WMAP lie comfortably in the 68% range of results from Gaussian simula- 
tions. Similar results are found for the other DA combinations, or when all the 24 possible 
W-only DA results are averaged. Although different combinations have different effective 
beam, full averaging is meaningful on large scales. We checked that averaging all 336 pos- 
sible combinations is also consistent with Gaussian. Finally, repeating all the measurement 
with noise weighting produced no obvious departure from Gaussianity either. 

At the same time, the scatter in the simulations, i.e., the probability density function 
(PDF) of ( from 100 simulation, shows a slightly non-Gaussian signature. For (W2,W3,W4), 
Figure 3 shows a histogram derived from all simulated ( values normalized by their measured 
median and 68% levels. Slight deviations from Gaussian distribution are evident: Student 
distribution of degree 3 fits better the overall distribution. This same distribution produces 
lower x 2 when applied to individual triangular configurations according to the inset, i.e. it 
is a (marginally) better fit than Gaussian. We fully take this into account in our hypothesis 
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testing which is described next. 

4. Hypothesis Testing 

Our goal is to test the null-hypothesis of Gaussianity against our measurements by 
means of comparing the ( values measured from the data with the corresponding probability 
distribution function (PDF) determined from Gaussian simulations. A crucial step in the 
traditional x 2 method appears to be computationally infeasible due to the large number of 
configurations: calculation of the (pseudo) inverse of an n x n matrix for n = 2.6 x 10 6 , 
the total number of our highly correlated configurations. Moreover, as seen above, the 
underlying PDF marginally violates Gaussian assumption, even for Gaussian simulations. 
Even if it were possible to calculate the inverse of the covariance matrix, and we were to 
accept the accuracy of the Gaussianity in PDF of the individual bins, it is not possible to 
determine the underlying covariance matrix with sufficient accuracy. In fact, one would need 
(e.g., Pan & Szapudi 2005) at least (and likely much more than) 2.6 million simulations for 
that purpose. Larson & Wandelt (2005) have shown that using simulations with uncorrelated 
noise might result in spurious detection of non- Gaussianity. Therefore we chose to use only 
the WMAP supplied correlated noise simulations, of which 110 is available at present. 

It is straightforward to test the null-hypothesis with a single configuration: we can 
calculate a p- value from the best fit Student distribution from our simulations. The p- value 
is defined as the probability of obtaining a ( value that is at least as extreme as the one 
measured from WMAP. For a threshold pt, the null-hypothesis is rejected at 1 — pt level if 
the p-value of the datum is smaller than p t . A problem arises from combining 2.6 million 
tests when all the data are used. For instance, even if the hypothesis were true, about 260 
bins would still be rejected at the 99.99% level (ignoring the correlations in the data). 

Fortunately a robust and simple method exists for massive hypothesis testing, which is 
insensitive to correlations between the tests, and makes no assumption on the Gaussianity of 
the underlying error distribution: the method of False Discovery Rate (FDR) (Benjamini & 
Hochberg 1995; Benjamini & Yekutieli 2001). In astronomy, it has been successfully applied 
in the context of image processing and finding outliers by Miller et al. (2001), which can 
be consulted for a more detailed introduction. The FDR method combines the same p- 
value as defined above for individual tests using a threshold for rejecting the null hypothesis. 
This combination is insensitive to correlations and has more statistical power than naive 
combination. Our goal is to adapt this powerful method for hypothesis testing of three-point 
correlation function measurements with overwhelming number of configurations. 



-10- 



The FDR method gives a simple prescription for finding a threshold for rejection. In 
particular, the recipe suggests that we choose a threshold such that we control the rate of 
false rejections or FDR. The parameter, taking a similar role to the confidence interval in 
more traditional tests, is the maximum rate of FDR. If we fix an a such that < a < 1, the 
FDR procedure will guarantee 



in ensemble average. 

Next we describe the recipe to control FDR; more details can be found in Miller et al. 
(2001). Let Pi, ...,Pn denote the p-values calculated from the measurements of iV configu- 
rations, sorted from smallest to largest. Let 



where cat is a constant depending on the level of correlations between different configurations. 
For uncorrelated data cjv = 1; while cjy ^ J2iL\ i~ l can be used for correlated data (Hopkins 
et al. 2002). Note that technically one would have to adjust c N to the degree of correlations 
in the data. The suggested value for correlated data is extremely conservative, and should 
be considered as a strong upper limit. Even using this conservative adjustment decreases 
the statistical power of the technique only logarithmically; the final results are expected to 
be robust regardless of the degree of correlations. 

If configurations with i < d are rejected, Equation 2 will hold (Benjamini & Hochberg 
1995), i.e. the FDR is controlled according to our preset parameter a. The procedure is 
represented graphically on Figure 4: Pj is plotted against j/N superposed with the line 
through the origin of slope ajcjq. All p- values reject the null hypotheses which are to the 
left from the last point at which Pj falls below the line. These might include some false 
discoveries which are guaranteed to be a smaller fraction than a in ensemble average. 



We have applied the FDR recipe to all of our individual cross-three point functions, as 
well as our full data set. Since cjy is a constant, initially we kept cjy = 1- For a fixed a, the 
results can be subsequently reinterpreted in terms of any cjv > 1- For DA combination (W2, 
W3, W4), there is no rejection for a < 0.81, i.e., allowing as high as 81% false rejections, 
not a single configuration rejected our Gaussian null hypothesis. Correlations might increase 
c^, but it must be ^ 14. The true a, when correlations are taking into account, can only 



(FDR) < a 



(2) 




(3) 



4.1. Application the WMAP C 
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be larger than our effective a for cn — 1. In other words, the data are fully consistent with 
Gaussianity. 

As a sanity check, we repeated the FDR analysis in our simulations as well. By scan- 
ning through different a values from to 1, we find that 50 out of 100 simulations have 
rejections with a < 0.81. This means that the WMAP measurements are fully consistent 
with Gaussianity at a level better than \-o in the traditional sense. In summary, at the 
three-point level, scanning all configurations, we did not find any significant non-Gaussianty 
which would be localized in pixel space triangular configurations. 

We performed FDR analysis on all 336 measurements individually, as well as on the 
combination of all those measurements with 2.6 million configurations in total. None of these 
cases produced credible evidence for non- Gaussianity and all of them were fully consistent 
with our null hypothesis at a ~ 0.8. 

5. Summary and Discussions 
5.1. Summary 

We presented a new method to measure angular three-point correlation functions on 
spherical maps. We achieve an unprecedented N log iV scaling with a combination of hierar- 
chical and Fourier algorithms. The speed of our technique allows a systematical scan of the 
full available configuration space at a given resolution. Such speed is especially useful for 
cross correlations and Monte Carlo simulations, where a vast number of configurations and 
measurements need to be performed. We have achieved a speed of about 2 minutes per cross 
correlations, when 336 cross-correlations have been estimated simultaneously in N sil i e = 256 
HEALPix maps using a single Intel Xeon 2.4GHz CPU. This is to be contrasted with a naive 
approach, which would have taken about 200 years per cross-correlations; a 20 million fold 
speed up. 

As a first application of our code we analyzed the first year WMAP data along with 100 
realistic simulations. We have calculated cross-correlations for about 2.6 x 10 6 triangular 
configurations, or about 4.4 x 10 13 triplets in total in maps of — 256 corresponding the 
8 DA's. The ratio of pixels/configurations is about 50 for each measurement. 

Comparing our measurements from 100 Gaussian simulations with realistic correlated 
noise, we found WMAP to be comfortably within the 68% percent range for most configura- 
tions. Any significant departure from Gaussianity at the three-point level, even if localized 
in particular triangular configurations, would have shown up clearly in our full scan of the 
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available configuration space. Our main result is that there is no credible evidence of non- 
Gaussianity at the three-point level at any of the triangular configurations we examined. 
As a consequence, if the tentative detection of non-Gaussianity claimed in previous works 
holds up, it should correspond to either 4-point or higher order correlations, or to spatially 
localized features which break rotational invariance (e.g., McEwen et al. 2005; Cayon et al. 
2005). In contrast with our measurements, all previous studies of higher order statistics 
used autocorrelations. Comparison of our errorbars with that of Gaztanaga & Wagg (2003) 
appears to show that this increases the errors by a factor of two (see the discussions below). 
In addition, many measurements used uncorrelated noise simulations. According to the 
findings of Larson & Wandelt (2005), this might increase the likelihood of finding spurious 
non-Gaussianity. 

Analysis of our Gaussian simulations revealed that there is a slight non-Gaussianity 
in the error distribution of individual configurations. This is not surprising, since three- 
point correlation function is a non-linear construction of the Gaussian random variables (c.f. 
Szapudi et al. 2000). The error distribution is well fit by a Student distribution with 3 
degrees of freedom. 

To quantify any possible departure of the overall data set from Gaussianity, we intro- 
duced a new technique, FDR, to interpret three-point statistics. This corresponds to an 
optimized multiple hypothesis testing, and it is insensitive to the unavoidable correlations 
in the data. All of our FDR tests, whether applied to any of the 336 cross correlations, or 
the combined data set, were fully consistent with Gaussianity with better then 1-a . This 
quantifies our previous assertion based on examination of the individual configurations under 
the assumption of statistical isotropy. 



5.2. Discussions: constraining specific models 

The above model independent tests showed that there is no credible evidence of any 
non-Gaussianity in the data. Next we illustrate how our measurements yield constraints on 
specific non-Gaussian models. We choose a simple phenomenological model corresponding to 
the quadratic expansion of the density field in terms of one parameter, /nlt, as put forward 
by (Gaztanaga & Wagg 2003): 

S = S L + f NLT (Sl-{S 2 L )). (4) 
To obtain constraints on this parameter, we construct an estimator for Jnlt 

f = mn ^ 

JNLT fcMfcfa) + + d(Wfl2) 1 } 
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where ( is our measurement in data or simulation maps. We calculated the two-point cor- 
relation function ^ analytically, to avoid any bias from the non-linear construction (c.f. 
Szapudi et al. 1999a). We used the same best fit power spectrum as for the simulations, as 
well as taking into account beam and pixel window functions. Since previous measurements 
already established the weakness of non-Gaussianity, our Gaussian simulations should be 
accurate enough to calculate the variance (Komatsu et al. 2003a). Applying the same esti- 
mator to our 100 simulations, we obtained error bars for Jnlt estimated from each particular 
configuration. 

The simplicity of the phenomenological model lies in the fact that a constant value 
of Jnlt is assumed. We do not attempt to combine our estimates optimally, instead we 
use simple considerations. The signal increases towards small scales in this model, while 
noise dominates on the smallest scales. Since we already discarded the smallest scales when 
using N size = 256, it is intuitively clear that most signal pertaining to this model will be 
concentrated in the small fraction of the triangles corresponding to small scales, in particular 
the skewness. To confirm this we generated and analyzed a set of non-Gaussian simulations 
according to Equation 4, with /nlt equal to 1000, 2000, 3000, 4000, and 6000. Inspection 
of the configurations together with the errorbars from the Gaussian simulations confirmed 
the above idea. Therefore we decided to use the skewness, which corresponds to giving 
zero weight to all other configurations when combining our /nlt estimators. From these we 
obtain 

f NLT ~ -110 ±150, (6) 

where the errorbar was estimated from the Gaussian simulations. Gaztahaga & Wagg (2003) 
quotes similar constraints for a low quadrupole CDM model, but their errorbars are a factor 
of two larger for a ACDM model similar to the one we use. The fact that we obtained a 
factor of two tighter constraints than Gaztanaga & Wagg (2003) suggests that using cross- 
correlations is superior to auto-correlations for three-point statistics of WMAP 

As a sanity check, we calculated the mean value of the skewness estimator for 100 
Gaussian simulations; it yields about /nlt — —8.0. On the other hand, we also demonstrate 
that we can recover /atlt from the non-Gaussian simulations. All the simulations have 
the same underlying Gaussian signal and noise, the only difference is the value of J'nlt- 
According to Figure 5 the errors might be underestimated when J'nlt > 2000, and/or there 
might be a small low bias, but it is clear from the figure that we could detect non-Gaussianity 
if it were present. 

A suboptimal combination of J'nlt estimates from all configurations weighted by their 
inverse variance yields about /nlt ~ —450 ± 500, a significantly weaker result, confirming 
the intuitive idea that most of the signal is concentrated on small scales. 
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Fig. 2. — ( values of all 8151 configurations for the cross correlation (W2,W3,W4). Top: 
the light gray points show the middle 68% range estimated from 100 Gaussian simulations, 
the dark gray points correspond to ( from WMAP. Bottom: zoom on the details inside the 
small box in the top. 
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Fig. 3. — Histogram of ( values of all 100 simulations for all configurations in (W2, W3, 
W4). Each value has been renormalized by subtracting the median, and divided with a 
calculated from 68 percentiles. The two curves correspond to best fitting Gaussian (dots) 
and Student function (solid line) with 3 degrees of freedom. Inset histogram of x 2 differences 
between fitting a Gaussian or Student distribution for each bin. According to the figure, most 
configurations are better fitted by a Student distribution as evidenced by the larger \ 2 f° r 
the former; however, Gaussian is still a reasonable fit. 
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Fig. 4. — The sorted p values vs. lines with slope a/c^. The p values shown in the curve 
are for all configurations of our typical DA combination (W2, W3, WA). The lower line is 
for a = 0.8 and produces no rejection. On the other hand, the upper line is for a = 0.9 
produces many (8000) rejections, probably all of them false discoveries (see text). 
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f NLT input 

Fig. 5. — Comparing the measured /nlt values, using skewness, with the input /nlt values 
for the non-Gaussian simulations. The straight line represents the input values. The points 
with error bars are the measured values. All the error bars are the same value from 100 
Gaussian simulations. 



